    Validation of a quantifier-based fuzzy classification system for breast cancer patients on external independent cohorts

    Recent studies in breast cancer domains have identified seven distinct clinical phenotypes (groups) using immunohistochemical analysis and a variety of unsupervised learning techniques. Consensus among the clustering algorithms has been used to categorise patients into these specific groups, but often at the expense of not classifying all patients. It is known that fuzzy methodologies can provide linguistic-based classification rules to ease those from consensus clustering. The objective of this study is to present the validation of a recently developed extension of a fuzzy quantification subsethood-based algorithm on three sets of newly available breast cancer data. Results show that our algorithm is able to reproduce the seven biological classes previously identified, preserving their characterisation in terms of marker distributions and therefore their clinical meaning. Moreover, because our algorithm constitutes the fundamental basis of the newly developed Nottingham Prognostic Index Plus (NPI+), our findings demonstrate that this new medical decision-making tool can help move towards more tailored care in breast cancer

    A novel semi-supervised algorithm for rare prescription side effect discovery

    Drugs are frequently prescribed to patients with the aim of improving each patient's medical state, but an unfortunate consequence of most prescription drugs is the occurrence of undesirable side effects. Side effects that occur in more than one in a thousand patients are likely to be signalled efficiently by current drug surveillance methods; however, these same methods may take decades before generating signals for rarer side effects, risking medical morbidity or mortality in patients prescribed the drug while the rare side effect is undiscovered. In this paper we propose a novel computational meta-analysis framework for signalling rare side effects that integrates existing methods, knowledge from the web, metric learning and semi-supervised clustering. The novel framework was able to signal many known rare and serious side effects for the selection of drugs investigated, such as tendon rupture when prescribed Ciprofloxacin or Levofloxacin, renal failure with Naproxen and depression associated with Rimonabant. Furthermore, for the majority of the drugs investigated it generated signals for rare side effects at a more stringent signalling threshold than existing methods, and it shows the potential to become a fundamental part of post-marketing surveillance to detect rare side effects

    Fuzzy Integral Driven Ensemble Classification using A Priori Fuzzy Measures

    Aggregation operators are mathematical functions that enable the fusion of information from multiple sources. Fuzzy Integrals (FIs) are widely used aggregation operators, which combine information with respect to a Fuzzy Measure (FM) that captures the worth of both the individual sources and all their possible combinations. However, FIs suffer from the potential drawback of not fusing information according to the intuitively interpretable FM, leading to non-intuitive results. The latter is particularly relevant when an FM has been defined using external information (e.g. experts). In order to address this and provide an alternative to the FI, the Recursive Average (RAV) aggregation operator was recently proposed, which enables intuitive data fusion with respect to a given FM. With an alternative fusion operator in place, in this paper, we define the concept of ‘A Priori’ FMs, which are generated based on external information (e.g. classification accuracy) and thus provide an alternative to the traditional approaches of learning or manually specifying FMs. We proceed to develop one specific instance of such an a priori FM to support the decision-level fusion step in ensemble classification. We evaluate the resulting approach by contrasting the performance of the ensemble classifiers for different FMs, including the recently introduced Uriz and the Sugeno lambda-measure, as well as by employing both the Choquet FI and the RAV as possible fusion operators. Results are presented for 20 datasets from machine learning repositories and contextualised to the wider literature by comparing them to state-of-the-art ensemble classifiers such as Adaboost, Bagging, Random Forest and Majority Voting
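
    As a rough illustration of the Choquet FI mentioned in the abstract above, the sketch below computes the standard discrete Choquet integral of a few classifier scores with respect to an explicitly tabulated fuzzy measure. It is not the paper's method: the function names (`choquet_integral`, `additive_measure`), the classifier labels and the scores are hypothetical, and the additive measure is only a convenient special case (for which the integral reduces to a weighted average), not the a priori FM the authors propose.

```python
from itertools import combinations


def choquet_integral(values, measure):
    """Discrete Choquet integral of `values` (source -> score) with respect
    to `measure` (frozenset of sources -> worth of that coalition)."""
    # Visit sources from the highest score to the lowest.
    ordered = sorted(values, key=values.get, reverse=True)
    total, prev_worth, coalition = 0.0, 0.0, set()
    for source in ordered:
        coalition.add(source)
        worth = measure[frozenset(coalition)]
        # Each score is weighted by the gain in worth from adding its source.
        total += values[source] * (worth - prev_worth)
        prev_worth = worth
    return total


def additive_measure(weights):
    """Hypothetical helper: additive measure built from per-source weights
    (assumed to sum to 1), tabulated over every non-empty subset."""
    sources = list(weights)
    measure = {}
    for size in range(1, len(sources) + 1):
        for subset in combinations(sources, size):
            measure[frozenset(subset)] = sum(weights[s] for s in subset)
    return measure


# Illustrative scores from three hypothetical classifiers for one class.
scores = {"svm": 0.9, "tree": 0.6, "knn": 0.3}
weights = {"svm": 0.5, "tree": 0.3, "knn": 0.2}
fused = choquet_integral(scores, additive_measure(weights))
print(round(fused, 4))
```

    Replacing the additive measure with one that assigns worth 1 to every non-empty coalition turns the integral into the maximum of the scores, while a measure that is 0 everywhere except on the full set gives the minimum; non-additive measures in between are what make the FI a non-linear aggregator.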

    Signalling paediatric side effects using an ensemble of simple study designs

    Background: Children are frequently prescribed medication 'off-label', meaning there has not been sufficient testing of the medication to determine its safety or effectiveness. The main reason this safety knowledge is lacking is due to ethical restrictions that prevent children from being included in the majority of clinical trials. Methods: Multiple measures of association are calculated for each drug and medical event pair and these are used as features that are fed into a classifier to determine the likelihood of the drug and medical event pair corresponding to an adverse drug reaction. The classifier is trained using known adverse drug reactions or known non-adverse drug reaction relationships. Results: The novel ensemble framework obtained a false positive rate of 0.149, a sensitivity of 0.547 and a specificity of 0.851 when implemented on a reference set of drug and medical event pairs. The novel framework consistently outperformed each individual simple study design. Conclusion: This research shows that it is possible to exploit the mechanism of causality and presents a framework for signalling adverse drug reactions effectively
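
    The three rates quoted in the Results follow the usual confusion-matrix definitions, and the false positive rate is by definition one minus the specificity, which the reported 0.149 and 0.851 satisfy. A minimal sketch of those definitions follows; the function name `rates` and the counts are purely illustrative and are not taken from the paper's reference set.

```python
def rates(tp, fn, fp, tn):
    """Return (sensitivity, specificity, false positive rate) from
    confusion-matrix counts of a binary classifier."""
    sensitivity = tp / (tp + fn)          # true ADRs that were signalled
    specificity = tn / (tn + fp)          # non-ADRs correctly not signalled
    false_positive_rate = fp / (fp + tn)  # non-ADRs wrongly signalled
    return sensitivity, specificity, false_positive_rate


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, fpr = rates(tp=5, fn=5, fp=2, tn=8)
print(sens, spec, fpr)

# Consistency check on the figures quoted in the abstract.
print(abs((0.149 + 0.851) - 1.0) < 1e-9)
```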

    Interpretability indices for hierarchical fuzzy systems

    Hierarchical fuzzy systems (HFSs) have been shown to have the potential to improve interpretability of fuzzy logic systems (FLSs). In recent years, a variety of indices have been proposed to measure the interpretability of FLSs such as the Nauck index and Fuzzy index. However, interpretability indices associated with HFSs have not so far been discussed. The structure of HFSs, with multiple layers, subsystems, and varied topologies, is the main challenge in constructing interpretability indices for HFSs. Thus, the comparison of interpretability between FLSs and HFSs, even at the index level, is still subject to open discussion. This paper begins to address these challenges by introducing extensions to the FLS Nauck and Fuzzy interpretability indices for HFSs. Using the proposed indices, we explore the concept of interpretability in relation to the different structures in FLSs and HFSs. Initial experiments on benchmark datasets show that based on the proposed indices, HFSs with equivalent function to FLSs produce higher indices, i.e. are more interpretable than their corresponding FLSs

    Comparison of Fuzzy Integral-Fuzzy Measure based Ensemble Algorithms with the State-of-the-art Ensemble Algorithms

    The Fuzzy Integral (FI) is a non-linear aggregation operator which enables the fusion of information from multiple sources with respect to a Fuzzy Measure (FM) that captures the worth of both the individual sources and all their possible combinations. Based on the expected potential of non-linear aggregation offered by the FI, its application to decision-level fusion in ensemble classifiers, i.e. to fuse multiple classifiers' outputs towards one superior decision-level output, has recently been explored. A key example of such an FI-FM ensemble classification method is the Decision-level Fuzzy Integral Multiple Kernel Learning (DeFIMKL) algorithm, which aggregates the outputs of kernel-based classifiers through the use of the Choquet FI with respect to an FM learned through a regularised quadratic programming approach. While the approach has been validated against a number of classifiers based on multiple kernel learning, it has thus far not been compared to the state-of-the-art in ensemble classification. Thus, this paper puts forward a detailed comparison of FI-FM based ensemble methods, specifically the DeFIMKL algorithm, with state-of-the-art ensemble methods including Adaboost, Bagging, Random Forest and Majority Voting over 20 public datasets from the UCI machine learning repository. The results on the selected datasets suggest that the FI-based ensemble classifier performs both well and efficiently, indicating that it is a viable alternative when selecting ensemble classifiers and that the non-linear fusion of decision-level outputs offered by the FI delivers the expected potential and warrants further study

    Validation of a quantifier-based fuzzy classification system for breast cancer patients on external independent cohorts

    Recent studies in breast cancer domains have identified seven distinct clinical phenotypes (groups) using immunohistochemical analysis and a variety of unsupervised learning techniques. Consensus among the clustering algorithms has been used to categorise patients into these specific groups, but often at the expense of not classifying all patients. It is known that fuzzy methodologies can provide linguistic-based classification rules to ease those from consensus clustering. The objective of this study is to present the validation of a recently developed extension of a fuzzy quantification subsethood-based algorithm on three sets of newly available breast cancer data. Results show that our algorithm is able to reproduce the seven biological classes previously identified, preserving their characterisation in terms of marker distributions and therefore their clinical meaning. Moreover, because our algorithm constitutes the fundamental basis of the newly developed Nottingham Prognostic Index Plus (NPI+), our findings demonstrate that this new medical decision-making tool can help move towards more tailored care in breast cancer. © 2016 IEEE

    Interpretability and complexity of design in the creation of fuzzy logic systems — a user study

    In recent years, researchers have become increasingly interested in designing interpretable Fuzzy Logic Systems (FLSs). Many studies have claimed that reducing the complexity of FLSs can lead to improved model interpretability. That is, reducing the number of rules tends to reduce the complexity of FLSs, thus improving their interpretability. However, none of these studies have considered interpretability and complexity from human perspectives. Since interpretability is of a subjective nature, it is essential to see how people perceive interpretability and complexity, particularly in relation to creating FLSs. Therefore, in this paper we have investigated this issue using an initial user study. This is the first time that a user study has been used to assess the interpretability and complexity of designs in relation to creating FLSs. The user study involved a range of expert practitioners in FLSs and received a diverse set of answers. We are interested to see whether, from the perspectives of people, FLSs are necessarily more interpretable when they are less complex in terms of their design. Although the initial user study is based on a small sample (25 participants), this research provides initial insight into this issue that motivates our future research

    Nottingham Prognostic Index Plus (NPI+): a modern clinical decision making tool in breast cancer

    Current management of breast cancer (BC) relies on risk stratification based on well-defined clinicopathologic factors. Global gene expression profiling studies have demonstrated that BC comprises distinct molecular classes with clinical relevance. In this study, we hypothesized that molecular features of BC are a key driver of tumour behaviour and, when coupled with a novel and bespoke application of established clinicopathologic prognostic variables, can predict both clinical outcome and relevant therapeutic options more accurately than existing methods. In the current study, a comprehensive panel of biomarkers with relevance to BC was applied to a large and well-characterised series of BC, using immunohistochemistry and different multivariate clustering techniques, to identify the key molecular classes. Subsequently, each class was further stratified using a set of well-defined prognostic clinicopathologic variables. These variables were combined in formulae to prognostically stratify different molecular classes, collectively known as the Nottingham Prognostic Index Plus (NPI+). NPI+ was then used to predict outcome in the different molecular classes. Seven core molecular classes were identified using a selective panel of 10 biomarkers. Incorporation of clinicopathologic variables in a second-stage analysis resulted in identification of distinct prognostic groups within each molecular class (NPI+). Outcome analysis showed that using the bespoke NPI formulae for each biological breast cancer class provides improved patient outcome stratification superior to the traditional NPI. This study provides proof-of-principle evidence for the use of NPI+ in supporting improved individualised clinical decision making

    Combining clustering and classification ensembles: A novel pipeline to identify breast cancer profiles

    Breast cancer is one of the most common causes of cancer death in women, representing a very complex disease with varied molecular alterations. To assist breast cancer prognosis, the classification of patients into biological groups is of great significance for treatment strategies. Recent studies have used an ensemble of multiple clustering algorithms to elucidate the most characteristic biological groups of breast cancer. However, the combination of various clustering methods resulted in a number of patients remaining unclustered. Therefore, a framework still needs to be developed which can assign as many unclustered (i.e. biologically diverse) patients as possible to one of the identified groups in order to improve classification. In this paper we develop a novel classification framework which introduces a new ensemble classification stage after the ensemble clustering stage to target the unclustered patients. Thus, a step-by-step pipeline is introduced which couples ensemble clustering with ensemble classification for the identification of core groups, the data distribution within them and an improvement in final classification results by targeting the unclustered data. The proposed pipeline is employed on a novel real-world breast cancer dataset and subsequently its robustness and stability are examined by testing it on standard datasets. The results show that by using the presented framework, an improved classification is obtained. Finally, the results have been verified using statistical tests, visualisation techniques, cluster quality assessment and interpretation from clinical experts